Abstract. The whole complex process by which a gene produces the protein it
encodes is difficult to capture in a single mathematical model. Many models
exist, each describing different aspects of a genetic network. Finding a better
model is one of the most important and interesting questions in computational
biology. Sequential dynamical systems were developed as a theory of computer
simulation, and in this paper a genetic sequential dynamical system is
introduced. A gene is considered to be a function which can take a finite number
of values. We prove that a genetic sequential dynamical system is a good
mathematical description of the finite state linear model introduced by
Brazma [?].



1. Introduction 

One important and interesting question in biology is how genes are regulated.
The most important models for gene regulation networks are Boolean models and
models based on differential equations. Boolean models [18], [14], [?] describe
the activity of genes using an element of $\mathbb{Z}_2^n = \{0,1\}^n$, that is,
an $n$-dimensional vector with entries in $\{0, 1\}$. Each entry $x_i$ records
whether gene $i$ is active. In the Boolean model we have the vector space
$\mathbb{Z}_2^n$ together with a function
$f : \mathbb{Z}_2^n \to \mathbb{Z}_2^n$. Each iteration of $f$ represents one
time step. The properties of the digraph associated with these iterations are
the characteristics of the network. There are two ways to generalize this
model: allowing more than two possible values for each gene, and allowing
several functions for each gene. The second approach leads to Probabilistic
Boolean Networks (PBN), a mathematical model introduced recently by Ilya
Shmulevich [27]. This model introduced probabilistic behavior into Boolean
networks and has been used to predict the steady states of genetic networks in
cancer cells [28]. For other mathematical models see [12].
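To make the Boolean model concrete, the following minimal sketch builds the
state-transition digraph of a small network and follows one trajectory until it
enters a cycle. The three-gene rule `f` is a hypothetical example chosen only
for illustration; it is not taken from the cited references.

```python
from itertools import product

def f(state):
    """Hypothetical 3-gene update rule over Z_2 (illustration only)."""
    x1, x2, x3 = state
    # In Z_2, AND is multiplication and XOR is addition.
    return (x2 & x3, x1, x1 ^ x2)

def state_space(update, n):
    """State-transition digraph as a dict: state -> update(state)."""
    return {s: update(s) for s in product((0, 1), repeat=n)}

def limit_cycle(update, start):
    """Iterate from `start` until a state repeats; return the cycle reached."""
    seen = []
    s = start
    while s not in seen:
        seen.append(s)
        s = update(s)
    return seen[seen.index(s):]

graph = state_space(f, 3)
fixed_points = [s for s, t in graph.items() if s == t]
cycle = limit_cycle(f, (1, 0, 0))
```

Here `graph` has one outgoing edge per state, so every trajectory eventually
falls into a fixed point or a longer cycle; these are the steady states the
text refers to.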

If we assume that a gene has more than two levels of activity, determined by
the environment and by the concentration of a particular substance in the
network, then the activity of gene $i$ is taken from the set of natural numbers
$\{0, 1, \ldots, m - 1\}$, where $m > 1$ [31]. One of the problems in studying
genetic networks with more than two values per gene is finding a good way to
describe the functions in the network. These functions have polynomial
representations over a finite field, and one of the most important results
obtained with the techniques of computer algebra is that we can find all the
polynomial solutions. These ideas first appeared in the seminar of Reinhard
Laubenbacher at the Virginia Bioinformatics Institute at Virginia Tech [?].
In [?], these functions are called partially defined functions and are studied
over a finite field. In [?], it is proved that polynomial solutions exist.
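As an illustration of why a polynomial solution always exists over a prime
field, the sketch below interpolates a partially defined function on
$\mathbb{Z}_3^3$ using indicator polynomials
$\prod_i (1 - (x_i - a_i)^{p-1})$, which equal 1 at the point $a$ and 0
elsewhere by Fermat's little theorem. This is one standard construction and
not necessarily the method of the cited references; the data points are the
first coordinates of the three transitions used in Example 5.4, written with
$-1$ as the residue 2.

```python
P = 3  # the prime field Z_3

def indicator(point, x, p=P):
    """Evaluate prod_i (1 - (x_i - a_i)^(p-1)) mod p: 1 if x == point, else 0."""
    value = 1
    for a_i, x_i in zip(point, x):
        value = value * (1 - pow(x_i - a_i, p - 1, p)) % p
    return value

def interpolate(data, p=P):
    """Return a polynomial function that agrees with `data` where it is defined."""
    def poly(x):
        return sum(v * indicator(a, x, p) for a, v in data.items()) % p
    return poly

# Partially defined function: only three of the 27 points are specified.
data = {(2, 1, 2): 0, (0, 1, 0): 1, (1, 1, 1): 2}
h = interpolate(data)

def simpler(x):
    """Another solution, x1 + x2 mod 3, matching the same data."""
    return (x[0] + x[1]) % P
```

Both `h` and `simpler` agree with the data but differ elsewhere (for example at
`(1, 0, 0)`), which shows that a partially defined function generally has many
polynomial solutions.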

The theory of sequential dynamical systems (SDS) was first introduced in [?]
as a mathematical abstraction of a system simulated by a computer. In [15],
[16], Laubenbacher and Pareigis introduced a categorical framework for the
study of SDS. In this paper, we describe a particular SDS for genetic networks.
In addition, we present a mathematical background for the study of the Finite
State Linear Model introduced by Brazma and Schlitt [4]. Using partially
defined functions and their polynomial solutions, we present a particular
example over a finite field with three elements.

This paper is organized as follows. In Section 2, we describe the ideas of the
Finite State Linear Model for gene regulation networks introduced by Alvis
Brazma in [?], using a notation slightly different from the one used in [?].
In Section 3, we compare the definition of SDS [?] with the definition of
parallel dynamical systems. In Section 4, we define two models: the first
describes the mathematical aspects of the Brazma model, and the second (the
genetic sequential dynamical system) generalizes it.

2. Finite state linear model 

Gene expression is a two-step process: first, a single stranded messenger RNA 
(mRNA) is copied (transcribed) from the strand of a duplex DNA molecule that 
encodes genetic information. In the second step, the mRNA moves to the cyto- 
plasm, is complexed to ribosomes, and its genetic information is translated into the 
amino acid sequence of a polypeptide. 

The model (FSLM) in [3] considers the following definitions of a gene (Section
3.3, [5]) and of gene regulation networks (Section 4.3, [5]).

Definition 2.1. A gene is a continuous stretch of a genomic DNA molecule, from
which a complex molecular machinery can read information (encoded as a string of
A, T, G, and C) and make a particular type of protein or a few different
proteins.

Transcription factors control a gene expression by binding the gene's promoter 
and either activating (switching on) the gene's transcription, or repressing it (switch- 
ing off). Transcription factors are gene products themselves, and therefore in turn 
can be controlled by other transcription factors. Transcription factors can control 
many genes, and some (probably most) genes are controlled by combinations of tran- 
scription factors. Feedback loops are possible. Therefore we can talk about gene 
regulation networks. Understanding, describing and modelling such gene
regulation networks is one of the most challenging problems in functional
genomics.

Now, we introduce the notation that we use in this paper. The network has $n$
genes $G_1, \ldots, G_n$. The binding sites are stages of the process of
transcription of a particular gene $G_j$, denoted by
$B_{j1}, \ldots, B_{jm_j}$. Each binding site $B_{jk}$ is determined by the
concentration $c_j(t)$ at time $t$ of a particular substance $i_j$ associated
with or generated by a gene $G_j$. There are two constants for each state of
the binding site $B_{jk}$, $a_{jk}$ and $d_{jk}$, called the association and
dissociation constants respectively. That is, taking the real number
$c_{jk}(t)$ and depending on its relation to $a_{jk}$ and $d_{jk}$, we assign a
state to the binding site $B_{jk}$. Each binding site $B_{jk}$ can take a
finite number of states. To simplify the problem, $B_{jk}$ is taken to be a
finite subset of the set of integers $\mathbb{Z}$. Let $b_{jk} \in B_{jk}$
denote a state of $B_{jk}$. In FSLM, $B_{jk}$ is called a multistate binding
site, and the vector $b_j = (b_{j1}, \ldots, b_{jm_j})$ a binding site vector.
The set of all possible vectors $b_j$ is the environment of the gene $G_j$,
that is

$$B_j = B_{j1} \times \cdots \times B_{jm_j} \subset \mathbb{Z}^{m_j}.$$

For each gene, there is a function $F_j$ (the control function). Its inputs are
the states of the binding sites, $B_j$. The function $F_j$ takes values among
the states of the binding sites, but its output is the production of a
substance at a given rate, which can act again on all the binding sites.
Therefore we consider control functions $F_j : B_j \to B_j$, whose output is a
vector of $B_j$ changed by the production of a substance. The production of the
substance is given by another function $c_j : \mathbb{R} \to \mathbb{R}$.

They make the following assumptions: 

(1) The activity of a gene is determined by the state $b_j \in B_j$ of the
transcription factor binding sites in its promoter regions.

(2) Each binding site can be in one of a finite number of states, characterized
by having or not having bound a particular transcription factor ($B_j$ is a
finite set).

(3) The state $b_{jk}$ of a binding site $B_{jk}$ depends on the concentration
$c_j(t)$ of the respective transcription factors.

(4) Depending on the state $b_j$ of the binding site $B_j$, a gene can either be
silent or have a particular activity level.

(5) If a gene $G_j$ is active, the concentration $c_j(t)$ of the substance $i_j$
that it produces grows linearly at a particular rate; otherwise it decreases
(or stays at 0).

Their multistate generalization is the following: 

(a) A binding site can competitively bind more than one substance and there- 
fore can have more than two states. 

(b) A gene can have more than two levels of activity. 

(c) A control function is not a Boolean function, but a mapping which maps a
vector of integers into an integer.

In the FSLM, multiple transcription factors can act on several binding sites
to produce a finite output state for a gene. The finite output state translates
to a particular (real-valued) growth or decay rate for a gene product.
Transcription factors are gene products like any other, and are measured by a
real-valued concentration $c_j(t)$. The concentration of transcription factors
determines the finite state of each binding site. Time in the model is
continuous, but measurements are made at a finite number of discrete time
points. What is measured is the concentration of the gene products.

This model is a simplification of the true biological process in which the RNA 
produced by transcription is later translated into proteins which have their own 
rate of decay. Proteins and other cellular species can also interact and activate or 
deactivate each other besides interacting with the binding sites. 

This model has two aspects. One is discrete, given by the control functions;
the other is continuous, given by the production of a substance at a given rate.
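The interplay of the discrete and continuous sides can be sketched for a single
self-repressing gene with one two-state binding site. The update rule below is
an assumption made for illustration: the site becomes bound once the
concentration reaches the association constant `a` and becomes free once it
falls to the dissociation constant `d`. The rates, the time step, and the
function name `simulate_fslm` are likewise hypothetical.

```python
def simulate_fslm(a=1.0, d=0.5, rate_up=0.2, rate_down=0.1, dt=0.1, steps=200):
    """Toy single-gene model: discrete binding state, piecewise linear c(t)."""
    c = 0.0        # concentration of the gene product
    bound = 0      # binding site state: 0 = free, 1 = bound
    history = []
    for step in range(steps):
        # Discrete side: the site state depends on c relative to a and d.
        if bound == 0 and c >= a:
            bound = 1
        elif bound == 1 and c <= d:
            bound = 0
        # Control function (assumed): the gene is active only if the site is free.
        active = bound == 0
        # Continuous side: linear growth when active, linear decay otherwise.
        if active:
            c = c + rate_up * dt
        else:
            c = max(0.0, c - rate_down * dt)
        history.append((step * dt, bound, c))
    return history

history = simulate_fslm()
```

With these parameters the concentration rises to about `a`, the site binds, the
concentration decays to about `d`, and the cycle repeats, so the trajectory is a
continuous piecewise linear curve driven by a finite-state switch.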



M. A. AVINO, H. ORTIZ, AND O. MORENO 



3. Sequential Dynamical Systems 

For a better understanding of the next section, we recall some definitions of
graphs and sequential dynamical systems.

Let $X$ be a set. Let $\mathcal{P}_2(X)$ be the set of all two-element subsets
of $X$.

Definition 3.1. A (loop free, undirected, finite) graph $G = (V_G, E_G)$
consists of a finite set $V_G$ of vertices and a subset
$E_G \subseteq \mathcal{P}_2(V_G)$ of edges.

Let $G$ be a graph. The 1-neighborhood $N(a)$ of a vertex $a \in V_G$ is the set

$$N(a) := \{\, b \in V_G \mid \{a, b\} \in E_G \text{ or } a = b \,\}.$$
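The 1-neighborhood is simple to compute from an edge list. The following sketch
uses a hypothetical path graph on four vertices; the vertex names and the helper
`one_neighborhood` are illustrative only.

```python
def one_neighborhood(vertex, edges):
    """N(a): the vertex itself together with every vertex sharing an edge with it."""
    neighbors = {vertex}
    for edge in edges:
        if vertex in edge:
            neighbors |= set(edge)
    return neighbors

# Undirected, loop-free edges are two-element sets of vertices.
edges = [frozenset(e) for e in [("a1", "a2"), ("a2", "a3"), ("a3", "a4")]]

n_a1 = one_neighborhood("a1", edges)
n_a2 = one_neighborhood("a2", edges)
```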

Let $V_G = \{a_1, \ldots, a_n\}$. Let $(k[a_i] \mid a_i \in V_G)$ be a family of
sets. Define

$$k^n := k[a_1] \times \cdots \times k[a_n] = \prod_{i=1}^{n} k[a_i],$$

the set of global states of $G$.

Definition 3.2. A function $f : k^n \to k^n$ is called local at $a_i \in V_G$ if

$$f(x_1, \ldots, x_n) = (x_1, \ldots, x_{i-1}, f^i(x_1, \ldots, x_n), x_{i+1}, \ldots, x_n),$$

where $f^i(x_1, \ldots, x_n) \in k[a_i]$ depends only on the variables in the
1-neighborhood $N(a_i)$ of the vertex $a_i$.

Definition 3.3. A sequential dynamical system (SDS)
$(Y, (k[a_i]), (f_i), \alpha)$ consists of

(1) a finite graph $Y$ with $n$ vertices,

(2) a family of sets $(k[a_i] \mid a_i \in V_Y)$ in $\mathbb{Z}$,

(3) a family of local functions $(f_i : k^n \to k^n$, where $f_i$ is local at
$a_i)$,

(4) and a word $\alpha = \alpha_Y = (\alpha_1, \ldots, \alpha_r) \in V_Y^{*}$ in
the Kleene closure of the set of vertices $V_Y$, called an update schedule
(i.e. a map $\alpha : \{1, \ldots, r\} \to V_Y$).

The word $\alpha$ is used to define the global update function of an SDS as the
function

$$f_{\alpha} = f_{\alpha_r} \circ \cdots \circ f_{\alpha_1} : k^n \to k^n.$$

The length of the update schedule $\alpha = (\alpha_1, \ldots, \alpha_r)$ is
$r$. The global update function of an SDS defines its dynamical behavior:
properties of limit cycles, transients, etc.
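The definition can be sketched in a few lines: each local function rewrites one
coordinate of the global state, and the schedule fixes the order of
composition. The path graph, the mod-2 neighborhood-sum rules, and the helper
names below are assumptions chosen for illustration.

```python
def make_local(i, rule):
    """Build a function local at vertex i: only coordinate i changes."""
    def f_i(state):
        new_state = list(state)
        new_state[i] = rule(state)
        return tuple(new_state)
    return f_i

def global_update(local_fns, schedule):
    """Compose the local functions; schedule[0] is applied first."""
    def f_alpha(state):
        for vertex in schedule:
            state = local_fns[vertex](state)
        return state
    return f_alpha

# Path graph 0 - 1 - 2 over Z_2; each rule sums the 1-neighborhood mod 2.
local_fns = {
    0: make_local(0, lambda x: (x[0] + x[1]) % 2),
    1: make_local(1, lambda x: (x[0] + x[1] + x[2]) % 2),
    2: make_local(2, lambda x: (x[1] + x[2]) % 2),
}

forward = global_update(local_fns, (0, 1, 2))
backward = global_update(local_fns, (2, 1, 0))
```

The same local functions give different global maps under different schedules:
here `forward((1, 0, 0))` and `backward((1, 0, 0))` disagree.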

Definition 3.4. A parallel dynamical system, or finite dynamical system, is a
function $F : k^n \to k^n$.

Remark 3.5. Every parallel system can be represented as a sequential system by
doubling the number of nodes and first copying the old states to the new
variables. Conversely, after composing all the local update functions of a
sequential system, the global update function $F : k^n \to k^n$ has coordinate
functions (in general different from the local update functions), and we can
regard the system as a parallel system given by these coordinate functions. So
the two representations are equivalent. If the system one wants to model is
naturally sequential, then the representation as an SDS is generally better
because it makes important system properties explicit.
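The node-doubling construction of Remark 3.5 can be checked on a small case. In
this sketch the parallel system swaps two Boolean variables; updating the
coordinates one at a time in place gives the wrong answer, while copying the old
states to $n$ extra nodes first reproduces the parallel step. The function names
are hypothetical.

```python
from itertools import product

def parallel_step(coords, x):
    """Apply every coordinate function to the same old state."""
    return tuple(F_i(x) for F_i in coords)

def naive_sequential_step(coords, x):
    """Update in place, so later coordinates see already-updated values."""
    state = list(x)
    for i, F_i in enumerate(coords):
        state[i] = F_i(tuple(state))
    return tuple(state)

def doubled_sequential_step(coords, x):
    """Sequential system on 2n nodes that simulates the parallel step."""
    n = len(x)
    state = list(x) + [0] * n            # nodes n..2n-1 hold the copies
    for i in range(n):                   # first copy the old states
        state[n + i] = state[i]
    for i, F_i in enumerate(coords):     # then update from the copies only
        state[i] = F_i(tuple(state[n:]))
    return tuple(state[:n])

swap = [lambda x: x[1], lambda x: x[0]]  # F(x1, x2) = (x2, x1)
states = list(product((0, 1), repeat=2))
```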
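
4. Two Models

In this section we define two models for genetic networks. First we introduce
the definition of a translated function.

Definition 4.1. A vector function
$\delta_{fg} = (\delta_1, \ldots, \delta_n) : \mathbb{R}^n \to \mathbb{Z}^n$ is a
translated function between two vector functions
$g = (g_1, \ldots, g_n) : \mathbb{R}^n \to \mathbb{R}^n$ and
$f = (f_1, \ldots, f_n) : \mathbb{Z}^n \to \mathbb{Z}^n$ if

$$\delta_i \circ g_i = f_i \circ \delta_i, \quad \text{for all } i = 1, \ldots, n.$$

That is, if the following diagram commutes:

$$\begin{array}{ccc}
\mathbb{R}^n & \xrightarrow{\;g_i\;} & \mathbb{R}^n \\
{\scriptstyle \delta_i} \downarrow & & \downarrow {\scriptstyle \delta_i} \\
\mathbb{Z}^n & \xrightarrow{\;f_i\;} & \mathbb{Z}^n
\end{array}$$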
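A quick numerical check can show what the commuting condition demands. The
sketch below reads the condition on whole vectors, $\delta \circ g = f \circ
\delta$, which is an interpretation of the definition, and takes $\delta$ to be
the componentwise floor. The maps `g_shift`, `f_shift`, `g_double`, `f_double`
are hypothetical examples: the shift pair commutes with the floor, the doubling
pair does not.

```python
import math

def delta(x):
    """Componentwise floor: a candidate translated function R^n -> Z^n."""
    return tuple(math.floor(v) for v in x)

def g_shift(x):
    """Continuous side: add 1 to every real coordinate."""
    return tuple(v + 1.0 for v in x)

def f_shift(k):
    """Discrete side: add 1 to every integer coordinate."""
    return tuple(v + 1 for v in k)

def g_double(x):
    return tuple(2.0 * v for v in x)

def f_double(k):
    return tuple(2 * v for v in k)

def commutes(delta_fn, g, f, samples):
    """Check delta(g(x)) == f(delta(x)) on a finite set of sample points."""
    return all(delta_fn(g(x)) == f(delta_fn(x)) for x in samples)

# Sample points are exactly representable in binary floating point.
samples = [(-2.5, 0.75), (0.0, 3.5), (-0.25, 1.0)]
shift_ok = commutes(delta, g_shift, f_shift, samples)
double_ok = commutes(delta, g_double, f_double, samples)
```

A finite sample can only refute the condition, not prove it; for the shift pair
the identity $\lfloor x + 1 \rfloor = \lfloor x \rfloor + 1$ holds for all real
$x$.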

Definition 4.2. A finite state model (FSM)
$T = (Y, (B_j), (F_j), (c_j), \delta_{Fc})$ consists of

(1) a finite graph $Y$ with $n$ vertices; $Y$ is the support graph of relations
between the $n$ genes $G_j$, with vertices $V_Y = \{g_1, \ldots, g_n\}$,

(2) a family of finite sets $B_j$ (binding sites), one for each $g_j \in V_Y$,

(3) a vector function
$F = (F_1, \ldots, F_n) : \prod_{j=1}^{n} B_j \to \prod_{j=1}^{n} B_j$ such that
$F_j : \prod_{j=1}^{n} B_j \to B_j$,

(4) a vector function $c = (c_1, \ldots, c_n) : \mathbb{R}^n \to \mathbb{R}^n$,
where $\mathbb{R}$ is the set of real numbers,

(5) a translated function $\delta_{Fc}$ between the two vector functions.

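Definition 4.3. Let $\{t_0, t_1, \ldots, t_n\}$ be a set of real numbers such
that $t_0 < t_1 < \cdots < t_n$. Let $(a_0, b_0), (a_1, b_1), \ldots, (a_n, b_n)$
be $n + 1$ pairs of real numbers. Suppose that

$$\begin{aligned}
a_0 t_1 + b_0 &= a_1 t_1 + b_1 \\
a_1 t_2 + b_1 &= a_2 t_2 + b_2 \\
&\;\;\vdots \\
a_{n-2} t_{n-1} + b_{n-2} &= a_{n-1} t_{n-1} + b_{n-1}.
\end{aligned}$$

We call the function

$$c(t) = \begin{cases}
0 & \text{if } t < t_0 \text{ or } t > t_n, \\
a_i t + b_i & \text{for } t \in [t_i, t_{i+1}],\ i = 0, 1, \ldots, n - 1,
\end{cases}$$

a sectional linear function.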
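A sectional linear function is straightforward to evaluate once the breakpoints
and coefficients are given. The sketch below takes the value outside
$[t_0, t_n]$ to be 0, which is an assumption, and uses as sample data the
coefficients of $c_1$ from Example 5.4 restricted to $[0, 3]$. The helper names
are hypothetical.

```python
def sectional_linear(breaks, coeffs):
    """breaks = [t_0, ..., t_n]; coeffs = [(a_0, b_0), ..., (a_{n-1}, b_{n-1})]."""
    def c(t):
        if t < breaks[0] or t > breaks[-1]:
            return 0.0                       # assumed value outside [t_0, t_n]
        for i, (a, b) in enumerate(coeffs):
            if breaks[i] <= t <= breaks[i + 1]:
                return a * t + b
    return c

def is_continuous(breaks, coeffs, tol=1e-9):
    """Check that adjacent pieces agree at every interior breakpoint."""
    interior = breaks[1:-1]
    return all(
        abs((a0 * t + b0) - (a1 * t + b1)) <= tol
        for t, (a0, b0), (a1, b1) in zip(interior, coeffs, coeffs[1:])
    )

breaks = [0.0, 1.0, 2.0, 3.0]
coeffs = [(0.28, 0.5), (0.72, 0.06), (-1.0, 3.5)]
c1 = sectional_linear(breaks, coeffs)
```

The continuity check corresponds to the chain of equalities in the definition:
each equation says that two consecutive pieces meet at their shared breakpoint.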



As a consequence of Section 2 we have proved part of the following theorem.

Theorem 4.4. The Brazma-Schlitt model is a finite state model in which the
functions $c_j$ are sectional linear functions.

Proof. Here, we only need to see that the functions $c_j$ give the concentration
of the substance. □

We assume the following:

(1) there are $n$ genes in the network $N$,

(2) for each $1 \le j \le n$ we have $m_j \in \mathbb{Z}^{+}$ binding sites
$\{B_{j1}, \ldots, B_{jm_j}\}$,

(3) $B_{jk}$ is a finite set, and $B_{jk} \subset \mathbb{Z}$, for all $j$ and
$k$,

(4) one gene can interact with another gene, and we describe this situation
by a graph $Y$ with set of vertices $V_Y = \{g_1, \ldots, g_n\}$,

(5) the environment of the network $N$ is the set

$$B = \prod_{j=1}^{n} B_j, \quad \text{where } B_j = B_{j1} \times \cdots \times B_{jm_j},$$

(6) for each gene $g_j$ we have a local function $F_j : B \to B$, in the sense
of Definition 3.2.

For every function $F$ with $n$ coordinates, we define the coordinate functions
$F_i$ as follows:

$$F(x_1, \ldots, x_n) = (F_1(x_1, \ldots, x_n), \ldots, F_n(x_1, \ldots, x_n)).$$

Definition 4.5. A genetic sequential dynamical system (GSDS) consists of a tuple
$(Y, (B_j), (f_j), (c_j), \alpha, \delta)$, where

(1) $Y$ is the support graph of relations between genes, with vertices
$V_Y = \{g_1, \ldots, g_n\}$,

(2) $B_j$ is a finite set for all $j$, and $B = \prod_{j=1}^{n} B_j$,

(3) $(f_j : B \to B)$ is a family of local functions (genetic functions),

(4) $\alpha$ is a word giving the order of interaction of the functions, that
is, it defines a function

$$F = f_{\alpha(n)} \circ \cdots \circ f_{\alpha(1)} = (F_1, \ldots, F_n),$$

(5) $c = (c_1, \ldots, c_n) : \mathbb{R}^n \to \mathbb{R}^n$ is a vector
function,

(6) $\delta_{Fc}$ is a translated function.

Definition 4.6. A genetic network is a pair $T = (F, c)$. A genetic network is
compatible if there exists a translated function $\delta_{Fc}$.

Theorem 4.7. The genetic sequential dynamical system is a generalization of the
Brazma-Schlitt model.

Proof. To prove the theorem, we show that all the considerations of the
Brazma-Schlitt (BS) model are included in the definition of a GSDS.

The interaction between genes is given by a graph $Y$ with $n$ vertices
$g_1, \ldots, g_n$ and an edge $\{g_i, g_k\}$ if gene $i$ has any relation with
gene $k$. So we have the first condition of Definition 4.5.

We can observe that in Definition 2.1 a gene in action is a complex molecular
machinery. So a gene is a function that can read information and make a
particular type of protein. Where does a gene read the information? In the
binding sites $B_j$, and the protein in turn changes the environment $B_j$. The
Brazma model considers functions $F_j$ from $B_j$ to $B_j$. Since the protein
can make changes in the 1-neighborhood, the genetic function $F_j$ goes from $B$
to $B$, and it is a local function. In the BS model, the binding sites are
finite sets. So we have conditions 2 and 3.

The genes act in an order, which implies an order in the composition of the
genetic functions. Thus condition 4 holds.

The inclusion of a family of functions $(c_j)$ and of the translated functions
in the definition of GSDS makes it possible to see both the continuous and the
discrete sides of genetic networks.

Then our claim holds. □

5. Examples

In all the examples, we obtained the functions using partially defined functions 
and polynomial representation, see [2]. 

Example 5.1. We describe the Boolean model presented in [11], [21] using a GSDS.
In this example the data take the discrete values 0 and 1.

(1) The digraph $Y$: [Figure: digraph on the vertices $g_0$, $g_1$, $g_2$, $g_3$.]

(2) $X = \mathbb{Z}_2 = \{0, 1\}$.

(3) The functions are the following:

$$\begin{aligned}
f_0(x_0, x_1, x_2, x_3) &= (1, x_1, x_2, x_3) \\
f_1(x_0, x_1, x_2, x_3) &= (x_0, 1, x_2, x_3) \\
f_2(x_0, x_1, x_2, x_3) &= (x_0, x_1, x_0 x_1, x_3) \\
f_3(x_0, x_1, x_2, x_3) &= (x_0, x_1, x_2, x_1(x_2 + 1))
\end{aligned}$$

(4) The schedule
$\alpha = \begin{pmatrix} 0 & 1 & 2 & 3 \\ 3 & 2 & 1 & 0 \end{pmatrix}$.

(5) The global function $f = f_0 \circ f_1 \circ f_2 \circ f_3 : X^4 \to X^4$ is

$$f(x_0, x_1, x_2, x_3) = (1, 1, x_0 x_1, x_1(x_2 + 1)).$$

We can observe that if we change the order of composition, we do not in general
obtain the same function.
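The closed form of the global function and its dependence on the order can be
verified by enumerating all 16 states. This check is a sketch written for this
example; the helper `compose` and the names `reverse` and `closed_form` are
introduced here for illustration.

```python
from itertools import product

# Local functions of Example 5.1 over Z_2.
def f0(x): return (1, x[1], x[2], x[3])
def f1(x): return (x[0], 1, x[2], x[3])
def f2(x): return (x[0], x[1], (x[0] * x[1]) % 2, x[3])
def f3(x): return (x[0], x[1], x[2], (x[1] * (x[2] + 1)) % 2)

def compose(*fns):
    """compose(g, h)(x) == g(h(x)); the rightmost function is applied first."""
    def composed(x):
        for fn in reversed(fns):
            x = fn(x)
        return x
    return composed

f = compose(f0, f1, f2, f3)          # f3 acts first, then f2, f1, f0
reverse = compose(f3, f2, f1, f0)    # the opposite order

def closed_form(x):
    return (1, 1, (x[0] * x[1]) % 2, (x[1] * (x[2] + 1)) % 2)

states = list(product((0, 1), repeat=4))
```

In the reversed order, $f_0$ and $f_1$ overwrite $x_0$ and $x_1$ before $f_2$
and $f_3$ read them, so the composition collapses to the constant map with
value $(1, 1, 1, 0)$.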

Remark 5.2. If genes have only three states, we can use the finite field
$\mathbb{Z}_3 = \{-1, 0, 1\}$, that is, the integers modulo 3, with
$1 + 1 = -1$. If genes have four possible states, we can use a finite field
with 4 elements. The finite field $GF(4)$ can be represented as
$GF(4) = \{0, 1, \alpha, \alpha^2\}$, where $\alpha$ is a root of the polynomial
$z^2 + z + 1$, that is, $\alpha^2 = \alpha + 1$ (with coefficients in
$\mathbb{Z}_2 = \{0, 1\}$). Writing $0 = 00$, $1 = 01$, $\alpha = 10$,
$\alpha^2 = 11$, the operations $+$ and $\times$ are as follows:

 +   00  01  10  11        ×   00  01  10  11

 00  00  01  10  11        00  00  00  00  00

 01  01  00  11  10        01  00  01  10  11

 10  10  11  00  01        10  00  10  11  01

 11  11  10  01  00        11  00  11  01  10
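The arithmetic of $GF(4)$ can be computed directly from the relation
$\alpha^2 = \alpha + 1$. In this sketch an element is stored as a 2-bit integer
whose bits are the coefficients of $\alpha$ and 1, so addition is bitwise XOR
and multiplication is polynomial multiplication reduced modulo $z^2 + z + 1$.
The function names are illustrative.

```python
def gf4_add(x, y):
    """Addition in GF(4): coefficientwise addition mod 2, i.e. XOR."""
    return x ^ y

def gf4_mul(x, y):
    """Multiplication in GF(4) with elements encoded as 2-bit integers."""
    product_bits = 0
    for shift in range(2):              # carry-less polynomial multiplication
        if (y >> shift) & 1:
            product_bits ^= x << shift
    if product_bits & 0b100:            # reduce z^2 using z^2 = z + 1
        product_bits ^= 0b111
    return product_bits

def table(op):
    """Full 4 x 4 operation table as strings such as '01' or '11'."""
    return [[format(op(x, y), "02b") for y in range(4)] for x in range(4)]

add_table = table(gf4_add)
mul_table = table(gf4_mul)
```

Since the field has characteristic 2, every element is its own additive
inverse, so the diagonal of the addition table is all `00`.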

Example 5.3. We describe the example that appears in [?] of the generalized
logical method developed by Thomas and colleagues [?]. Here we use the FSM; in
this case it is not linear, and the data take two or three integer values. We
have three genes, and the regulatory network is the following:

(1) The digraph $Y$: [Figure: digraph on the vertices $g_1$, $g_2$, $g_3$.]

(2) For genes $g_1$ and $g_3$ we have $\mathbb{Z}_3 = \{0, 1, 2\}$, and for gene
$g_2$ we have $X_2 = \{0, 1\}$.

(3) The functions are:

$$\begin{aligned}
f_1(x_1, x_2, x_3) &= -x_2 \\
f_2(x_1, x_2, x_3) &= 1 + x_1^2 x_3^2 \\
f_3(x_1, x_2, x_3) &= 2 + x_1 + 2x_3 + x_1 x_3 + 2x_1^2 + x_3^2 + 2x_1^2 x_3 + 2x_1 x_3^2
\end{aligned}$$

(4) The global function

$$F = (f_1, f_2, f_3) : \mathbb{Z}_3 \times X_2 \times \mathbb{Z}_3 \to \mathbb{Z}_3 \times X_2 \times \mathbb{Z}_3$$

such that

$$F(x_1, x_2, x_3) = (f_1(x_1, x_2, x_3),\ f_2(x_1, x_2, x_3),\ f_3(x_1, x_2, x_3)).$$

Example 5.4. We assume that a microarray experiment gives the following data,
where $c_j(t)$ denotes the concentration of each gene $j$:

 t    G_1               G_2              G_3

 0    c_1(0) = 0.5      c_2(0) = 1.2     c_3(0) = 0.5

 1    c_1(1) = 0.78     c_2(1) = 1.2     c_3(1) = 1.25

 2    c_1(2) = 1.5      c_2(2) = 1.2     c_3(2) = 1.5

 3    c_1(3) = 0.5      c_2(3) = 1.2     c_3(3) = 0.5

The vector of concentrations is $c(t) = (c_1(t), c_2(t), c_3(t))$, and if we
suppose that the functions $c_j$ are sectional linear functions, then

$$c(t) = \begin{cases}
(0.28t + 0.5,\ 1.2,\ 0.75t + 0.5), & \text{when } t \in [0, 1] \subset \mathbb{R}, \\
(0.72t + 0.06,\ 1.2,\ 0.25t + 1), & \text{when } t \in [1, 2] \subset \mathbb{R}, \\
(-t + 3.5,\ 1.2,\ -t + 3.5), & \text{when } t \in [2, \infty) \subset \mathbb{R}.
\end{cases}$$

On the other hand, we will assume that the average concentration level of $G_1$
is 0.78, of $G_2$ is 0.75, and of $G_3$ is 1.25. So we assign values to the
states of genes $G_1$, $G_2$, and $G_3$:

$\delta_1$ : { less than 0.78 $\mapsto -1$, 0.78 $\mapsto 0$, more than 0.78 $\mapsto 1$ }

$\delta_2$ : { less than 0.75 $\mapsto -1$, 0.75 $\mapsto 0$, more than 0.75 $\mapsto 1$ }

$\delta_3$ : { less than 1.25 $\mapsto -1$, 1.25 $\mapsto 0$, more than 1.25 $\mapsto 1$ }

We suppose that for each gene $G_j$ we have a binding site
$B_j = X = \{-1, 0, 1\}$, and we consider the operations in the finite field
$X = \mathbb{Z}_3$. Our problem is the following: we know $c(t)$ from the
microarray experiment, and we suppose that $c(t)$ is a vector of sectional
linear functions, but we want to obtain a function $f$ such that
$f(-1, 1, -1) = (0, 1, 0)$ for $t = 1$, $f(0, 1, 0) = (1, 1, 1)$ for $t = 2$,
and $f(1, 1, 1) = (-1, 1, -1)$ for $t = 3$. In this case, one of the possible
functions is $f(x_1, x_2, x_3) = (x_1 + x_2, x_2, x_3 + x_2)$. In addition, we
can obtain a graph $Y$ by observing how the genes change under the action of
$f$.

[Figure: digraph $Y$ on the vertices $g_1$, $g_2$, $g_3$.]

Now we have a GSDS $(Y, \mathbb{Z}_3, (f_j), (\delta_j), (c_j), \alpha)$ for
this dataset:

(1) a collection $x_1, x_2, x_3$ of variables, which take values in the finite
field $X = \mathbb{Z}_3$,

(2) $Y$ is the support directed graph of relations between genes, with vertices
$\{g_1, g_2, g_3\}$,

(3) for each $j = 1, 2, 3$, the local update functions

$$f_1(x_1, x_2, x_3) = (x_1 + x_2, x_2, x_3), \quad f_2(x_1, x_2, x_3) = (x_1, x_2, x_3),$$

and $f_3(x_1, x_2, x_3) = (x_1, x_2, x_2 + x_3)$,

(4) a schedule
$\alpha = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix}$,

(5) the global function $f = f_3 \circ f_2 \circ f_1 : X^3 \to X^3$ obtained
from the schedule $\alpha$,

(6) a vector function $c(t) = (c_1(t), c_2(t), c_3(t))$,

(7) the translated functions $\delta_i$.
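The pieces of this example can be checked together. The sketch below
discretizes the measured concentrations with the thresholds 0.78, 0.75 and 1.25
used in the maps $\delta_1$, $\delta_2$, $\delta_3$, and confirms that the
composed local functions carry each discrete state to the next one. The helper
`centered`, which picks the representative of a residue mod 3 in
$\{-1, 0, 1\}$, and the other names are introduced here for illustration.

```python
def discretize(value, threshold):
    """Below the threshold -> -1, equal -> 0, above -> 1."""
    return (value > threshold) - (value < threshold)

def centered(v):
    """Representative of v modulo 3 in {-1, 0, 1}."""
    return (v + 1) % 3 - 1

thresholds = (0.78, 0.75, 1.25)
data = {
    0: (0.5, 1.2, 0.5),
    1: (0.78, 1.2, 1.25),
    2: (1.5, 1.2, 1.5),
    3: (0.5, 1.2, 0.5),
}

# Local update functions over Z_3 and their composition f = f3 o f2 o f1.
def f1(x): return (centered(x[0] + x[1]), x[1], x[2])
def f2(x): return x
def f3(x): return (x[0], x[1], centered(x[1] + x[2]))
def f(x): return f3(f2(f1(x)))

states = {
    t: tuple(discretize(v, th) for v, th in zip(row, thresholds))
    for t, row in data.items()
}
consistent = all(f(states[t]) == states[t + 1] for t in range(3))
```

The value of `consistent` expresses that discretizing the data at time $t + 1$
gives the same result as applying $f$ to the discretized data at time $t$,
which is the role the translated functions play on this dataset.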


